Abstract

A new family of nonparametric statistics, the r-statistics, is introduced.
It consists of counting the number of records of the cumulative sum of
the sample. The single-sample r-statistic is almost as powerful as
Student's t-statistic for Gaussian and uniformly distributed variables, and
more powerful than the sign and Wilcoxon signed-rank statistics as long
as the data are not too heavy-tailed.

Three two-sample parametric r-statistics are proposed, one with a
higher specificity but a smaller sensitivity than the Mann-Whitney U-test
and another with a higher sensitivity but a smaller specificity. A
nonparametric two-sample r-statistic is introduced, whose power is very close
to that of the Welch statistic for Gaussian or uniformly distributed variables.


keywords: nonparametric statistics, signal-to-noise ratio, statistical power, AUC, 
record statistics 


1 Introduction 

Nonparametric statistics play a special role in data analysis as they are usually
more robust and require fewer assumptions about the underlying data distribution
[1]. Well-known nonparametric statistics, such as the sign and Wilcoxon signed-rank
statistics for single samples and the Mann-Whitney U-statistic for two samples, are
however much less powerful than the parametric t- or Welch statistics for Gaussian
or uniformly distributed variables, while the opposite holds for fat-tailed data.
Here I propose a new type of nonparametric statistics, called r-statistics, which
are almost as powerful as the t- and Welch statistics for Gaussian variables and
better than all of the above for not too fat-tailed variables. As a consequence,
they provide a robust alternative to the usual statistics.

Let us write down the definition of the t-statistic as a way to introduce
useful notation. Take a sample of $N$ values of a quantity of interest, denoted
by $\{x_n\}$, $n = 1, \ldots, N$, assumed to be independent and identically
distributed (iid). Denoting an estimate with a hat, the t-statistic of the sample
is $t = \hat{\theta}\sqrt{N}$, where $\hat{\theta} = \hat{\mu}/\hat{\sigma}$ is its
estimated signal-to-noise ratio (SNR hereafter), $\hat{\mu}$ its estimated average
and $\hat{\sigma}$ its estimated standard deviation.

The robustness of commonly used nonparametric statistics is due in part
to the fact that they reduce sample values to integer quantities, such as ranks
and signs, from which the statistics are computed. The same recipe underlies
r-statistics, which are based on the (integer) number of records of the cumulative
sum (or equivalently the integrated signal) of the sample values, defined as
$\mathcal{X}_N = \{X_t\}_{1 \le t \le N}$ where $X_t = \sum_{n=1}^{t} x_n$,
$1 \le t \le N$. If the distribution of $x$ has a zero average, $X_t$ is nothing
else than the position of an unbiased random walker at time $t$. A remarkable
result, based on the Sparre Andersen theorem [2], states that the distribution of
the number of upper records (or equivalently the number of jumps of the running
maximum) in $N$ steps, denoted by $R_+$, does not depend on the distribution of
$x_n$ as long as it is symmetric (i.e. $x$ and $-x$ are equiprobable) and
continuous, and the sample values are uncorrelated [3]; note that the first
point is always considered as the first upper (and lower) record (see Fig. 1). In
addition, this distribution is known exactly:

$$P(R_+ \mid N) = \binom{2N - R_+ + 1}{N}\, 2^{-2N + R_+ - 1}, \qquad (1)$$

which tends to a Gaussian distribution
$\mathcal{N}(\sqrt{4N/\pi}, (2 - 4/\pi)N)$ for large $N$ [4].
For symmetry reasons, the number of lower records (i.e., the number of jumps
of the running minimum), denoted by $R_-$, follows the same distribution. This
result has spawned many studies on so-called record statistics (see [3] for a
review).
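The universality claim can be checked numerically. The following Python sketch compares the exact record-number law, in the form usually quoted in the record-statistics literature, with a simulation of Gaussian samples. It assumes the convention in which the starting point of the walk counts as the first record, so that a sample of N values corresponds to N - 1 steps after the first point; the names `record_pmf` and `count_upper_records` are illustrative and are not taken from the author's code.

```python
import math
import random


def record_pmf(n_steps):
    """Exact law of the number M of upper records after n_steps steps.

    Assumed convention: the starting point is record number 1, so M
    ranges from 1 to n_steps + 1.
    """
    return {
        m: math.comb(2 * n_steps - m + 1, n_steps) / 2 ** (2 * n_steps - m + 1)
        for m in range(1, n_steps + 2)
    }


def count_upper_records(sample):
    """Number of upper records of the cumulative sum; X_1 is the first record."""
    total, running_max, records = 0.0, -math.inf, 0
    for x in sample:
        total += x
        if total > running_max:
            running_max = total
            records += 1
    return records


N = 20
pmf = record_pmf(N - 1)  # N sample values = N - 1 steps after the first point
mean_exact = sum(m * p for m, p in pmf.items())

rng = random.Random(1)
trials = 20000
mean_sim = sum(
    count_upper_records([rng.gauss(0.0, 1.0) for _ in range(N)])
    for _ in range(trials)
) / trials

print(f"exact mean: {mean_exact:.3f}, simulated mean: {mean_sim:.3f}")
```

Replacing `rng.gauss` by any other symmetric continuous generator (for instance `rng.uniform(-1, 1)`) should leave the simulated mean unchanged up to sampling noise, which is the content of the universality statement.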

2 Single-sample statistics 

Even though single-sample testing is a well-trodden domain in statistics, using an
ever so slightly more powerful statistic provides an invaluable advantage in
competitive situations such as speculative trading or friend-or-foe identification.
One of the problems of single-sample nonparametric statistics is that they are
less powerful than the t-statistic for Gaussian or uniformly distributed variables.
The r-statistic remedies this problem while being robust. Note that Sparre
Andersen's assumption of a symmetric distribution is shared by the Wilcoxon
signed-rank statistic.

Up to this point, the quantities $R_+$ and $R_-$ have two flaws as statistics.
First, they are bounded from below by zero, thus it is much easier to design a
statistical test based on their difference $R_0 = R_+ - R_-$. The number of upper
records of $\mathcal{X}_N$ has a straightforward interpretation: $R_+$ is nothing
else than the number of time steps during which $X_t$ is not in a drawdown (i.e.,
not below its running maximum). Thus, the quantity $R_+ - R_-$ is the time spent
in a drawup minus the time spent in a drawdown.

Figure 1: Schematic explanation of the idea behind the r-statistics: one
computes the difference between the number of jumps of the running maximum
(dashed lines) and the number of jumps of the running minimum (dotted lines)
of the cumulated sums of the sample values, averaged over many random
permutations. By convention, the first point counts as a first jump for both the
running maximum and minimum. The r-statistic $r \simeq 0.3005$ is simply
$R_0/\sigma_N$ where $\sigma_N \simeq 1.97$ for $N = 6$ (see Eq. (2)).

Second, $R_0$ is by definition an integer number, which may be detrimental to
both statistical power and efficiency. The key new idea is to note that, for iid
data $x_n$, the integrated signal of any random permutation of $\{x_n\}$ is as
valid a representation as $\mathcal{X}_N$. Thus one can compute the average of
$R_0$ over $P > 1$ random permutations, denoted by $\bar{R}_0$. Figure 1 explains
this idea graphically.
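The permutation average described above can be sketched in a few lines of Python. This is a minimal illustration rather than the author's implementation (which is in R and C++); the function names, the default number of permutations and the seed handling are assumptions made for the example.

```python
import random


def record_difference(values):
    """R_+ - R_- for the cumulative sum of `values`.

    The first point counts as both an upper and a lower record.
    """
    total = 0.0
    running_max = running_min = None
    upper = lower = 0
    for v in values:
        total += v
        if running_max is None or total > running_max:
            running_max = total
            upper += 1
        if running_min is None or total < running_min:
            running_min = total
            lower += 1
    return upper - lower


def averaged_r0(sample, n_perm=1000, seed=0):
    """Average of R_0 over n_perm random permutations of the sample."""
    rng = random.Random(seed)
    work = list(sample)  # shuffle a copy, leave the caller's data untouched
    acc = 0
    for _ in range(n_perm):
        rng.shuffle(work)
        acc += record_difference(work)
    return acc / n_perm


rng = random.Random(7)
sample = [rng.gauss(0.2, 1.0) for _ in range(100)]
print(averaged_r0(sample))
```

A sample whose values are all positive has a strictly increasing cumulative sum for every permutation, so its averaged $R_0$ equals $N - 1$ exactly; this provides a quick sanity check of the record counting.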

Let us simply write $R_0$ instead of $\bar{R}_0$ in the following for the sake of
readability. By definition, the distribution of $R_0$ converges towards a Gaussian
distribution of zero average. Because the numbers of upper and lower records of
a given random walk are correlated in an unknown way, the standard deviation
of the distribution of $R_0$, denoted by $\sigma_N$, must be measured numerically
for the time being. Extensive numerical simulations (see Appendix A) show that
$\sigma_N = 1.66\,(1 - 0.88\,N^{-1/2})\sqrt{(2 - 4/\pi)N}$, thus the single-sample
r-statistic is defined as

$$r = \frac{R_0}{1.66\,(1 - 0.88\,N^{-1/2})\sqrt{(2 - 4/\pi)N}}. \qquad (2)$$

Asymptotically $P(R_0) \to \mathcal{N}[0, (\sigma_N)^2]$, i.e., $P(r) \to
\mathcal{N}(0, 1)$, but the convergence to a Gaussian distribution is quite slow.
For example, $P(R_0)$ is Gaussian up to 2 standard deviations for $N = 1000$ (see
Appendix A); thus, for the time being, to build a statistical test one must resort
to estimating the distribution $P(R_0)$ numerically for a given $N$ and use it to
obtain p-values. Computations are quick (and the full source code is available).
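As an illustration of this procedure, the Python sketch below evaluates the fitted $\sigma_N$, the resulting r-statistic, and a Monte Carlo p-value. Two choices are assumptions of the example and not statements about the author's code: the null distribution of $R_0$ is simulated from Gaussian samples of the same length, and the p-value is two-sided with the common (count + 1)/(draws + 1) convention. All function names are hypothetical.

```python
import math
import random


def sigma_n(n):
    """Fitted standard deviation of the permutation-averaged R_0 (Eq. 2)."""
    return 1.66 * (1.0 - 0.88 / math.sqrt(n)) * math.sqrt((2.0 - 4.0 / math.pi) * n)


def record_difference(values):
    """R_+ - R_- of the cumulative sum; the first point counts for both."""
    total = 0.0
    running_max = running_min = None
    upper = lower = 0
    for v in values:
        total += v
        if running_max is None or total > running_max:
            running_max, upper = total, upper + 1
        if running_min is None or total < running_min:
            running_min, lower = total, lower + 1
    return upper - lower


def averaged_r0(sample, n_perm, rng):
    """Average of R_0 over n_perm random permutations."""
    work = list(sample)
    acc = 0
    for _ in range(n_perm):
        rng.shuffle(work)
        acc += record_difference(work)
    return acc / n_perm


def r_statistic(sample, n_perm, rng):
    """Permutation-averaged R_0 divided by the fitted sigma_N."""
    return averaged_r0(sample, n_perm, rng) / sigma_n(len(sample))


def mc_p_value(sample, n_null=200, n_perm=50, seed=0):
    """Two-sided Monte Carlo p-value; Gaussian null samples are an assumption."""
    rng = random.Random(seed)
    n = len(sample)
    observed = abs(averaged_r0(sample, n_perm, rng))
    exceed = sum(
        abs(averaged_r0([rng.gauss(0.0, 1.0) for _ in range(n)], n_perm, rng)) >= observed
        for _ in range(n_null)
    )
    return (exceed + 1) / (n_null + 1)


demo_rng = random.Random(42)
demo = [demo_rng.gauss(0.3, 1.0) for _ in range(50)]
r_demo = r_statistic(demo, 200, demo_rng)
p_demo = mc_p_value(demo)
print("r =", r_demo, "p =", p_demo)
```

The numbers of null draws and permutations are kept small so that the sketch runs quickly in pure Python; the simulations reported in this paper use 10000 of each.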

Assessing the power of the single-sample r-statistic requires estimating
$P(R_0)$ for $\theta = 0$ and for the alternative $\theta \neq 0$ (separately), and
then computing the Receiver Operating Characteristic (ROC) curve of the
r-statistic [5]. ROC curves for the r-, t-, sign, and Wilcoxon signed-rank
statistics are reported in Appendix B. The ROC curves of r-statistics do not cross
those of the other statistics, hence the Area Under Curve (AUC), a scalar summary
of statistical power measured from ROC curves (the larger, the better), is
meaningful for comparing the power of the r-statistic with that of other
statistics. Let us start with Gaussian variables. The t-statistic is uniformly
most powerful in this case [6], hence one expects its AUC to be the largest of
all. Figure 2 shows that while the sign and Wilcoxon statistics are much less
powerful than the t-statistic for Gaussian variables (as is well known), the
r-statistic has very nearly the same power as the t-statistic. Uniformly
distributed variables lead to similar results (same figure).
Generically, the relative power of the r-statistic with respect to that of the
sign and Wilcoxon statistics decreases as the tails of the data become heavier.
This is illustrated in Fig. 3, which reports the AUC versus $\nu$, the tail
parameter of Student's t-distribution (used as a parametric way to obtain
heavy-tailed data).
The Wilcoxon statistic becomes more powerful than the r-statistic for
$\nu \lesssim 2.5$, while the sign statistic wins when $\nu < 3.5$. The same
behaviour is found for exponentially distributed variables (same figure), in which
case the sign statistic is better than the r-statistic.

Figure 2: Area under curve (AUC) versus the signal-to-noise ratio
$\theta = \mu/\sigma$ of the alternative, for Gaussian and uniform variables (one
panel each); $N = 100$, 10000 samples per point, 10000 random permutations
per sample. Error bars set at two standard deviations. Continuous lines are
only meant for eye-guidance.
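The AUC comparison used in this section can be reproduced for any statistic once samples of it under the null and under the alternative are available. The Python sketch below does so for the t-statistic on Gaussian data; it is an illustration under assumptions (one-sided alternative with positive $\theta$, 1000 samples instead of 10000, hypothetical function names), not the code behind the figures.

```python
import math
import random
import statistics
from bisect import bisect_left, bisect_right


def t_statistic(sample):
    """t = (estimated mean / estimated standard deviation) * sqrt(N)."""
    n = len(sample)
    return statistics.fmean(sample) / statistics.stdev(sample) * math.sqrt(n)


def auc(null_stats, alt_stats):
    """Area under the ROC curve: P(alt > null) + 0.5 * P(alt == null)."""
    ordered = sorted(null_stats)
    total = 0.0
    for a in alt_stats:
        below = bisect_left(ordered, a)            # null values strictly smaller
        ties = bisect_right(ordered, a) - below    # null values equal to a
        total += below + 0.5 * ties
    return total / (len(ordered) * len(alt_stats))


rng = random.Random(2)
N, theta, n_samples = 100, 0.1, 1000
null_t = [t_statistic([rng.gauss(0.0, 1.0) for _ in range(N)]) for _ in range(n_samples)]
alt_t = [t_statistic([rng.gauss(theta, 1.0) for _ in range(N)]) for _ in range(n_samples)]
auc_t = auc(null_t, alt_t)
print(f"AUC of the t-statistic: {auc_t:.3f}")
```

Under a normal approximation of the t-statistic, the AUC is roughly $\Phi(\theta\sqrt{N/2})$, i.e. about 0.76 for these parameters, which gives a reference value for the simulation. The same `auc` function can be fed with permutation-averaged $R_0$ values to compare statistics on equal terms.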

One of the assumptions of the r-statistic is that the increments have a zero
average, but this does not tell what the alternative is. When the average
increment is not zero, but the increments still come from a distribution that is
symmetric around its average, the average record number of such random walks is
a function of the signal-to-noise ratio $\theta = \mu/\sigma$ [4]. Thus the
r-statistic is a test for the signal-to-noise ratio.


3 Two-sample situation 

Building a two-sample version of the r-statistic may be done in several ways. Let
us denote the two samples by $x = \{x_n\}$, $n = 1, \ldots, N_x$, and
$y = \{y_m\}$, $m = 1, \ldots, N_y$. Assuming for the time being that
$N_x = N_y$, the simplest idea is to test if the difference of sample elements
$z = \{z_n = x_n - y_n\}$ has zero signal-to-noise ratio, hence zero average,
which amounts to computing the r-statistic of $\{z_n\}$, denoted $R_z$ in the
following. It is nonparametric by definition.

Note that if the two samples are paired, then the same random permutation must
be applied to both samples; otherwise, independent permutations may be applied
to $x$ and $y$.
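The distinction between paired and unpaired samples can be made concrete with a short Python sketch of $R_z$ for samples of equal length. The function name `r_z`, the `paired` flag and the default number of permutations are assumptions of this example.

```python
import random


def record_difference(values):
    """R_+ - R_- of the cumulative sum; the first point counts for both."""
    total = 0.0
    running_max = running_min = None
    upper = lower = 0
    for v in values:
        total += v
        if running_max is None or total > running_max:
            running_max, upper = total, upper + 1
        if running_min is None or total < running_min:
            running_min, lower = total, lower + 1
    return upper - lower


def r_z(x, y, paired, n_perm=1000, seed=0):
    """Permutation-averaged R_0 of the differences z_n = x_n - y_n."""
    if len(x) != len(y):
        raise ValueError("this sketch assumes samples of equal length")
    rng = random.Random(seed)
    xs, ys = list(x), list(y)
    acc = 0
    for _ in range(n_perm):
        if paired:
            # Same permutation for both samples: pairs (x_n, y_n) stay together.
            order = list(range(len(xs)))
            rng.shuffle(order)
            z = [xs[i] - ys[i] for i in order]
        else:
            # Independent permutations: elements are matched at random.
            rng.shuffle(xs)
            rng.shuffle(ys)
            z = [a - b for a, b in zip(xs, ys)]
        acc += record_difference(z)
    return acc / n_perm


rng = random.Random(3)
x = [rng.gauss(0.0, 1.0) for _ in range(100)]
y = [rng.gauss(0.3, 1.0) for _ in range(100)]
print(r_z(x, y, paired=False), r_z(x, y, paired=True))
```

In the paired case the multiset of differences is fixed and only their order changes, whereas in the unpaired case each permutation produces a new set of differences.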

Figure 3: Area under curve (AUC) vs the signal-to-noise ratio
$\theta = \mu/\sigma$ for various types of distributions of $\{x_n\}$. $N = 100$,
10000 samples per point, 10000 random permutations per sample. Error bars set at
two standard deviations. Continuous lines are only meant for eye-guidance.

Another approach is to compute record statistics for each sample and then
compare them. For example, one can use the difference between the number of
upper (or lower) records of both samples, i.e.,

$$R_+^{(2)} = R_+(x) - R_+(y) \qquad (3)$$

$$R_-^{(2)} = R_-(x) - R_-(y). \qquad (4)$$

This suggests a fourth statistic, $R_d = R_+^{(2)} - R_-^{(2)}$. Given the fact
that the expected number of records associated with a sample of non-zero average
is a function of both the signal-to-noise ratio and of the distribution of the
sample values, the distributions of these three statistics have zero average if
both samples have the same distribution and the same signal-to-noise ratio, which
is therefore their associated null hypothesis. These three statistics would be
nonparametric if their standard deviation were nonparametric. This is not the
case, as shown by Ref. [3], which gives a generic expression of the
distribution-dependent prefactor of this quantity. Hence, $R_\pm^{(2)}$ and $R_d$
are bound to be parametric.
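A minimal Python sketch of the three record-difference statistics follows. It assumes, as for the single-sample case, that the record counts are averaged over independent random permutations of each sample; the function names and the dictionary keys are illustrative only.

```python
import random


def count_records(values):
    """(upper, lower) record counts of the cumulative sum of `values`.

    The first point counts as both an upper and a lower record.
    """
    total = 0.0
    running_max = running_min = None
    upper = lower = 0
    for v in values:
        total += v
        if running_max is None or total > running_max:
            running_max, upper = total, upper + 1
        if running_min is None or total < running_min:
            running_min, lower = total, lower + 1
    return upper, lower


def two_sample_record_stats(x, y, n_perm=1000, seed=0):
    """Permutation averages of R_+^(2), R_-^(2) and R_d = R_+^(2) - R_-^(2)."""
    rng = random.Random(seed)
    xs, ys = list(x), list(y)
    plus = minus = 0
    for _ in range(n_perm):
        rng.shuffle(xs)
        rng.shuffle(ys)
        up_x, low_x = count_records(xs)
        up_y, low_y = count_records(ys)
        plus += up_x - up_y     # Eq. (3)
        minus += low_x - low_y  # Eq. (4)
    plus /= n_perm
    minus /= n_perm
    return {"R_plus": plus, "R_minus": minus, "R_d": plus - minus}


rng = random.Random(4)
x = [rng.gauss(0.0, 1.0) for _ in range(100)]
y = [rng.gauss(0.3, 1.0) for _ in range(100)]
print(two_sample_record_stats(x, y))
```

Note that $R_d$ equals $R_0(x) - R_0(y)$, so it compares the single-sample record differences of the two samples rather than the record difference of their element-wise difference.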

Figure 4 shows ROC curves for the $R_\pm^{(2)}$ and $R_d$ statistics when both
samples have the same distribution, the same length and one of them has zero
average. ROC curves for all distributions have common characteristics:
generically, $R_+^{(2)}$ has the largest specificity in the limit of large
specificity and the smallest sensitivity in the limit of large sensitivity of all
statistics tested here, while $R_-^{(2)}$ is the exact opposite. $R_d$ and $R_z$
have approximately the same power as the Welch statistic for distributions with
mild tails (Gaussian and uniform), but do worse than Mann-Whitney otherwise;
thus, preferring $R_z$ over $R_d$ makes sense.

Figure 4: Two-sample situation: ROC curves for Gaussian (left plot) and
Student t-distributed variables with $\nu = 3.5$ (right plot) for the four
r-statistics, the Mann-Whitney U-statistic and the Welch statistic. Both samples
have $\sigma = 1$, while $E(x) = 0$ and $E(y) = 1$. $N = 100$, 10000 samples per
point, 10000 random permutations per sample. Specificity is equal to 1 minus the
false positive rate, and sensitivity is the true positive rate.

There are two subtleties that apply to all two-sample versions of the
r-statistics. First, as noted above, paired samples require the same random
permutation for both samples, whereas independent permutations may be used
otherwise. The second subtlety arises when the two samples do not have
the same number of elements, denoted by $N_x$ and $N_y$ respectively. One
solution consists in computing the record statistics of permutations of
$\min(N_x, N_y)$ elements from each sample. Because of the random nature of the
permutations, this scheme ensures a fair sampling of the larger sample; another
possibility is to keep all elements of the larger sample and to resample the
smaller sample so as to have two samples of equal length.
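Both options for unequal sample sizes can be sketched in Python as follows. The first function draws $\min(N_x, N_y)$ elements without replacement from each sample at every permutation; the second shows one possible reading of "resampling" the smaller sample, namely drawing with replacement, which is an assumption of this example. Function names are hypothetical.

```python
import random


def record_difference(values):
    """R_+ - R_- of the cumulative sum; the first point counts for both."""
    total = 0.0
    running_max = running_min = None
    upper = lower = 0
    for v in values:
        total += v
        if running_max is None or total > running_max:
            running_max, upper = total, upper + 1
        if running_min is None or total < running_min:
            running_min, lower = total, lower + 1
    return upper - lower


def r_z_unequal(x, y, n_perm=1000, seed=0):
    """Averaged R_0 of x - y using min(N_x, N_y) randomly chosen elements."""
    rng = random.Random(seed)
    xs, ys = list(x), list(y)
    k = min(len(xs), len(ys))
    acc = 0
    for _ in range(n_perm):
        # random.sample returns k distinct elements in random order, i.e. a
        # random permutation of a random subset of each sample.
        sub_x = rng.sample(xs, k)
        sub_y = rng.sample(ys, k)
        acc += record_difference([a - b for a, b in zip(sub_x, sub_y)])
    return acc / n_perm


def resample_to_length(sample, length, rng):
    """Second option (assumed: with replacement): lengthen the smaller sample."""
    sample = list(sample)
    return [rng.choice(sample) for _ in range(length)]


rng = random.Random(5)
x = [rng.gauss(0.0, 1.0) for _ in range(150)]
y = [rng.gauss(0.3, 1.0) for _ in range(100)]
print(r_z_unequal(x, y))
print(len(resample_to_length(y, len(x), rng)))
```

With the first option every element of the larger sample is used in some permutations, which is the fair sampling property mentioned above.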

4 Conclusion 

While the r-statistics already have some direct applications, three important
cases still need to be investigated. First, r-statistics as introduced here are
only valid for uncorrelated data. While there is no exact result about record
statistics of correlated random walks, numerical simulations point to simple
corrections in specific cases; whether r-statistics remain nonparametric for
non-iid variables remains to be tested. Practically, a simple modification of
the way in which r-statistics are computed respects short-range correlation:
one should permute blocks of data instead of single values, much like block
bootstraps; the length of the blocks may be found in a self-consistent way.
The second case is discrete distributions. While the Sparre Andersen theorem is
only valid for continuous variables, record statistics of random walks with
discrete increments are similar [3]. Finally, the case of non-symmetric
distributions, e.g., the role of the skew, is to be investigated.
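The block-permutation idea mentioned above can be sketched in Python as follows. The block length is taken as a user-supplied parameter here; choosing it self-consistently is not covered, and the function names are assumptions of this example.

```python
import random


def record_difference(values):
    """R_+ - R_- of the cumulative sum; the first point counts for both."""
    total = 0.0
    running_max = running_min = None
    upper = lower = 0
    for v in values:
        total += v
        if running_max is None or total > running_max:
            running_max, upper = total, upper + 1
        if running_min is None or total < running_min:
            running_min, lower = total, lower + 1
    return upper - lower


def block_permutation(values, block_len, rng):
    """Shuffle consecutive blocks of length block_len (the last may be shorter)."""
    values = list(values)
    blocks = [values[i:i + block_len] for i in range(0, len(values), block_len)]
    rng.shuffle(blocks)
    # The order inside each block is preserved, which keeps short-range
    # correlations intact within blocks.
    return [v for block in blocks for v in block]


def averaged_r0_blocks(sample, block_len, n_perm=1000, seed=0):
    """Average of R_0 over random block permutations of the sample."""
    rng = random.Random(seed)
    acc = 0
    for _ in range(n_perm):
        acc += record_difference(block_permutation(sample, block_len, rng))
    return acc / n_perm


rng = random.Random(6)
sample = [rng.gauss(0.1, 1.0) for _ in range(100)]
print(averaged_r0_blocks(sample, block_len=5))
```

Setting `block_len=1` recovers the plain permutation scheme used for iid data.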

The fact that the r-statistics reflect signal-to-noise ratios is of particular
interest to finance, because the performance of two assets (or trading
strategies) is traditionally compared with their Sharpe ratios, which are
conceptually signal-to-noise ratios. The usual methods are essentially equivalent
to a Welch statistic of their difference, computed with more sophisticated methods
such as bootstraps and generalized moment methods. The latter require the
estimation of the first and second moments of the samples; some of them require
the third and the fourth moments, which is problematic for asset prices.
Further work explores the efficiency of the r-statistic as a signal-to-noise
estimator.

As a final note, the universality of record statistics of unbiased random walks
with symmetric increments extends to the time between two records [3], which
however does not yield a more powerful statistic.

The author thanks Gilles Fay for his suggestions. 

Full source code (R and C++) available at
https://github.com/damienchallet/rstatistics.


A Asymptotic behaviour of the single-sample $R_0$

A.1 Standard deviation

Numerical simulations were performed for
$N = \lfloor 10 \cdot (100^{1/20})^k \rfloor$ for $k = 0, 1, \ldots, 20$, thus
$N \in [10, 1000]$: $10^5$ samples of $R_0$ were computed for Gaussian
variables (10000 random permutations for each sample). Then a non-linear fit

$$\sigma_N = \sqrt{(2 - 4/\pi)N}\,[a(1 - bN^{-c})] \qquad (5)$$

yields $a = 1.659 \pm 0.008$, $b = 0.88 \pm 0.04$, $c = 0.5 \pm 0.02$ (errors set
at two standard deviations); the goodness of fit is obvious in Figure 5.
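For convenience, the simulation grid and the fitted curve of Eq. (5) can be evaluated with the short Python sketch below. It only tabulates the reported fit with its central values; it does not redo the fit. The function names and the small tolerance added before taking the floor (to guard against floating-point rounding at exact integers) are choices made for this example.

```python
import math


def simulation_grid(k_max=20):
    """Sample sizes N = floor(10 * 100**(k / k_max)), k = 0, ..., k_max."""
    # 1e-9 guards against results such as 99.999999 for exact integers.
    return [math.floor(10 * 100 ** (k / k_max) + 1e-9) for k in range(k_max + 1)]


def sigma_fit(n, a=1.659, b=0.88, c=0.5):
    """Fitted standard deviation of R_0, Eq. (5), with the central fit values."""
    return math.sqrt((2.0 - 4.0 / math.pi) * n) * a * (1.0 - b * n ** (-c))


grid = simulation_grid()
for n in grid[::5]:
    print(n, round(sigma_fit(n), 3))
```

The correction factor $1 - bN^{-c}$ is still about 0.72 at $N = 10$ and about 0.97 at $N = 1000$, which shows how slowly $\sigma_N$ approaches its asymptotic scaling.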

A.2 Convergence to a Gaussian distribution

The distribution of the r-statistic $r = R_0/\sigma_N$ converges slowly to a
Gaussian, as illustrated by the qq-plot of Fig. 6.

B ROC curves 

B.1 One sample

Figure 7 plots ROC curves for the four distributions investigated here. It should
be noted that the r-statistic curves do not cross those of the other statistics.

B.2 Two samples

Figure 8 plots ROC curves for the uniform and exponential distributions; those
for Gaussian and Student's t distributions are reported in Fig. 4.




Figure 5: Standard deviation of $R_0$ as a function of $N$: numerical simulations
(circles) and non-linear fit (continuous line); $10^5$ samples of $R_0$ per point
were computed for Gaussian variables, with $10^5$ permutations for each sample.

Figure 6: Normal Q-Q plot showing the convergence of the single-sample
r-statistic $R_0$ to a Gaussian variable as a function of $N$.

Figure 7: Single-sample situation: ROC curves for various distributions of
$\{x_n\}$; the alternative has a signal-to-noise ratio
$\theta = \mu/\sigma = 0.11$, 10000 samples of length $N = 100$, 10000 random
permutations per sample.

Figure 8: Two-sample situation: ROC curves for various statistics. Both
samples have $\sigma = 1$, while $E(x) = 0$ and $E(y) = 1$. $N = 100$, 10000
samples per point, 10000 permutations per sample.